Multilingual Word Sense Disambiguation Using Wikipedia

Authors

  • Bharath Dandala
  • Rada Mihalcea
  • Razvan C. Bunescu
Abstract

We present three approaches to word sense disambiguation that use Wikipedia as a source of sense annotations. Starting from a basic monolingual approach, we develop two multilingual systems: one that uses a machine translation system to create multilingual features, and one where multilingual features are extracted primarily through the interlingual links available in Wikipedia. Experiments on four languages confirm that the Wikipedia sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. The experiments also show that the multilingual systems obtain on average a substantial relative error reduction when compared to the monolingual systems.

Similar Resources

SemEval-2013 Task 12: Multilingual Word Sense Disambiguation

This paper presents the SemEval-2013 task on multilingual Word Sense Disambiguation. We describe our experience in producing a multilingual sense-annotated corpus for the task. The corpus is tagged with BabelNet 1.1.1, a freely-available multilingual encyclopedic dictionary and, as a byproduct, WordNet 3.0 and the Wikipedia sense inventory. We present and analyze the results of participating sy...

Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources

In this paper we present the mapping between WordNet domains and WordNet topics, and the emergent Wikipedia categories. This mapping leads to a coarse alignment between WordNet and Wikipedia, useful for producing domain-specific and multilingual corpora. Multilinguality is achieved through the cross-language links between Wikipedia categories. Research in word-sense disambiguation has shown tha...

Automatic Identification and Disambiguation of Concepts and Named Entities in the Multilingual Wikipedia

In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two languages, with both word senses (i.e. concepts) and named entities. We use Babelfy 1.0, a state-of-the-art multilingual Word Sense Disambiguation and Entity Linking system. As its reference inventory, Babelfy draws upon BabelNet 3.0, a very large multilingual encyclopedic dictionary and semantic network...

DAEBAK!: Peripheral Diversity for Multilingual Word Sense Disambiguation

We introduce Peripheral Diversity (PD) as a knowledge-based approach to achieve multilingual Word Sense Disambiguation (WSD). PD exploits the frequency and diverse use of word senses in semantic subgraphs derived from larger sense inventories such as BabelNet, Wikipedia, and WordNet in order to achieve WSD. PD’s F-measure scores for SemEval 2013 Task 12 outperform the Most Frequent Sense (MFS)...

Publication date: 2013